Compact Audio Representation for Event Detection in Consumer Media

Authors

  • Xiaodan Zhuang
  • Stavros Tsakalidis
  • Shuang Wu
  • Pradeep Natarajan
  • Rohit Prasad
  • Premkumar Natarajan
Abstract

Local audio-visual descriptors are often compactly stored using representations such as the soft quantization histogram [1]. Typically, classification performance with histogram representations is improved through the use of large codeword sets. Unfortunately, this approach runs into overfitting and scalability challenges when applied to richly diverse real-world collections. A novel “i-vector” approach was recently proposed for the speaker-verification task [2]. In this work, we study the relative effectiveness of the i-vector as a compact representation of local audio descriptors (e.g., MFCCs) within a multimedia event detection system. Specifically, we model the local audio descriptors using a Gaussian Mixture Model (GMM). Following [2], we constrain the GMM parameters to a low-dimensional subspace while preserving most of the variability (i.e., information) in the descriptors. The GMM parameters in the subspace constitute a compact representation that exhibits robustness in the face of sparse data. We evaluate the method by performing the multimedia event detection (MED) task using only audio information within consumer (e.g., YouTube) videos. Experiments with the 2011 TRECVID MED data show that the i-vector provides superior performance and lower dimensionality than the bag-of-words soft quantization histograms used in the state-of-the-art BBN VISER system in the 2011 TRECVID MED Evaluation.
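
As a rough illustration of the subspace constraint described in the abstract, the total-variability formulation commonly used for i-vectors in speaker verification can be sketched as follows. The abstract does not define any notation, so every symbol here (M, m, T, w, N, F̃, Σ) is an assumption borrowed from that standard formulation, and the paper's own notation and estimation details may differ.

```latex
% Assumed notation (not given in the abstract):
%   M        : GMM mean supervector for one audio clip
%   m        : mean supervector of a clip-independent background GMM
%   T        : low-rank "total variability" matrix spanning the subspace
%   w        : low-dimensional latent vector; its posterior mean is the i-vector
%   N        : block-diagonal matrix of per-component occupation counts for the clip
%   \tilde F : first-order statistics of the clip, centered on the background means
%   \Sigma   : block-diagonal matrix of the background GMM covariances
\begin{align}
  M &= m + T\,w, \qquad w \sim \mathcal{N}(0, I) \\
  \hat{w} &= \left( I + T^{\top} \Sigma^{-1} N\, T \right)^{-1} T^{\top} \Sigma^{-1} \tilde{F}
\end{align}
```

Under these assumptions, the dimensionality of the representation is the number of columns of T, which is chosen independently of the number of GMM components. The identity term in the inverse comes from the standard normal prior on w and keeps the estimate well conditioned when a clip contributes few descriptors, which is one way to read the abstract's claim of robustness to sparse data.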


Similar resources

Robust audio-codebooks for large-scale event detection in consumer videos

In this paper we present our audio-based system for detecting “events” within consumer videos (e.g., YouTube) and report our experiments on the TRECVID Multimedia Event Detection (MED) task and development data. Codebook or bag-of-words models have been widely used in text, visual and audio domains and form the state-of-the-art in MED tasks. The overall effectiveness of these models on such dat...


Event-based media processing and analysis: A survey of the literature

Research on event-based processing and analysis of media is receiving increasing attention from the scientific community due to its relevance for an abundance of applications, from consumer video management and video surveillance to lifelogging and social media. Events have the ability to semantically encode relationships of different informational modalities, such as visual-audio-text, time...


DCAR: A Discriminative and Compact Audio Representation to Improve Event Detection

This paper presents a novel two-phase method for audio representation, Discriminative and Compact Audio Representation (DCAR), and evaluates its performance at detecting events in consumer-produced videos. In the first phase of DCAR, each audio track is modeled using a Gaussian mixture model (GMM) that includes several components to capture the variability within that track. The second phase ta...


Automatic detection of audio events indicating threats

This paper focuses on audio event classification and detection for the purpose of citizens’ security in the urban environment. Various acoustic/audio events occur during potentially dangerous situations. The main goal of our work was to build a simple audio event detection system trained on a database of recordings and to test the often-used approach based on MF...


Analysis of Compact Disc Digital Rights Management

Music publishers using the compact disc standard have introduced various copy protection schemes into the marketplace in the past several years, ostensibly in an effort to reduce sagging sales that the industry blames on illegal distribution of high-quality copies of compact discs. Over the course of three years, we performed a battery of tests against dozens of discs from various parts of the ...



Publication date: 2012